[NVVM][MLIR] Remove Pure trait from clock, clock64, globaltimer Ops #147608

schwarzschild-radius
Contributor

This commit removes the Pure trait from the clock, clock64, and globaltimer Ops by creating an NVVM_NCSpecialRegisterOp class to represent Ops that return non-constant values. This prevents the CSE pass from treating repeated reads of these registers as redundant and eliminating them.
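
To illustrate why the trait matters, here is a minimal toy model of CSE in Python. It is a sketch under stated assumptions, not MLIR's actual implementation: the `Op` class, its `pure` flag, and the `cse` function are hypothetical names invented for this example. The model merges two ops only when both are marked pure and have the same name and operands, which is roughly the licence that the Pure trait gives the real pass.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    result: str           # SSA name defined by this op, e.g. "%1"
    name: str             # op name, e.g. "nvvm.read.ptx.sreg.clock"
    operands: tuple = ()  # SSA names this op reads
    pure: bool = False    # toy stand-in for the Pure trait (assumption)


def cse(ops):
    """Toy CSE: drop a pure op if an identical pure op was already seen."""
    seen = {}      # (name, operands) -> result of the first occurrence
    replaced = {}  # removed result -> surviving result
    kept = []
    for op in ops:
        # Redirect operands that referred to ops removed earlier.
        operands = tuple(replaced.get(o, o) for o in op.operands)
        key = (op.name, operands)
        if op.pure and key in seen:
            replaced[op.result] = seen[key]
            continue
        if op.pure:
            seen[key] = op.result
        kept.append(Op(op.result, op.name, operands, op.pure))
    return kept


def two_clock_reads(pure):
    """Two back-to-back clock reads followed by a hypothetical user op."""
    return [
        Op("%1", "nvvm.read.ptx.sreg.clock", pure=pure),
        Op("%2", "nvvm.read.ptx.sreg.clock", pure=pure),
        Op("%3", "use", ("%1", "%2")),
    ]


before = cse(two_clock_reads(pure=True))   # models the ops with Pure
after = cse(two_clock_reads(pure=False))   # models the ops without Pure
```

In this model, `before` keeps only the first read and rewires the user to consume `%1` twice, so any elapsed-time computation built on the two reads would collapse to zero. `after` keeps both reads, which is the behaviour this change is meant to guarantee.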

@llvmbot
Member

llvmbot commented Jul 8, 2025

@llvm/pr-subscribers-mlir-llvm

@llvm/pr-subscribers-mlir

Author: Pradeep Kumar (schwarzschild-radius)

Changes

This commit removes the Pure trait from the clock, clock64, and globaltimer Ops by creating an NVVM_NCSpecialRegisterOp class to represent Ops that return non-constant values. This prevents the CSE pass from treating repeated reads of these registers as redundant and eliminating them.


Full diff: https://github.com/llvm/llvm-project/pull/147608.diff

2 Files Affected:

  • (modified) mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td (+10-3)
  • (added) mlir/test/Dialect/LLVMIR/cse-nvvm.mlir (+37)
diff --git a/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td b/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
index 6895e946b8a45..a0d23853a52dd 100644
--- a/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+++ b/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
@@ -159,6 +159,13 @@ class NVVM_SpecialRegisterOp<string mnemonic, list<Trait> traits = []> :
   let assemblyFormat = "attr-dict `:` type($res)";
 }
 
+// NVVM_NCSpecialRegisterOp represents a non-constant special register
+class NVVM_NCSpecialRegisterOp<string mnemonic, list<Trait> traits = []> :
+  NVVM_IntrOp<mnemonic, traits, 1> {
+  let arguments = (ins);
+  let assemblyFormat = "attr-dict `:` type($res)";
+}
+
 class NVVM_SpecialRangeableRegisterOp<string mnemonic, list<Trait> traits = []> :
   NVVM_SpecialRegisterOp<mnemonic,
     !listconcat(traits,
@@ -249,9 +256,9 @@ def NVVM_ClusterDim : NVVM_SpecialRangeableRegisterOp<"read.ptx.sreg.cluster.nct
 
 //===----------------------------------------------------------------------===//
 // Clock registers
-def NVVM_ClockOp : NVVM_SpecialRegisterOp<"read.ptx.sreg.clock">;
-def NVVM_Clock64Op : NVVM_SpecialRegisterOp<"read.ptx.sreg.clock64">;
-def NVVM_GlobalTimerOp : NVVM_SpecialRegisterOp<"read.ptx.sreg.globaltimer">;
+def NVVM_ClockOp : NVVM_NCSpecialRegisterOp<"read.ptx.sreg.clock">;
+def NVVM_Clock64Op : NVVM_NCSpecialRegisterOp<"read.ptx.sreg.clock64">;
+def NVVM_GlobalTimerOp : NVVM_NCSpecialRegisterOp<"read.ptx.sreg.globaltimer">;
 
 //===----------------------------------------------------------------------===//
 // envreg registers
diff --git a/mlir/test/Dialect/LLVMIR/cse-nvvm.mlir b/mlir/test/Dialect/LLVMIR/cse-nvvm.mlir
new file mode 100644
index 0000000000000..8d24c3846f178
--- /dev/null
+++ b/mlir/test/Dialect/LLVMIR/cse-nvvm.mlir
@@ -0,0 +1,37 @@
+// RUN: mlir-opt %s -cse -split-input-file -verify-diagnostics | FileCheck %s
+
+// CHECK-LABEL: @nvvm_special_regs_clock
+llvm.func @nvvm_special_regs_clock() -> !llvm.struct<(i32, i32)> {
+  %0 = llvm.mlir.zero: !llvm.struct<(i32, i32)>
+  // CHECK:  {{.*}} = nvvm.read.ptx.sreg.clock
+  %1 = nvvm.read.ptx.sreg.clock : i32
+  // CHECK:  {{.*}} = nvvm.read.ptx.sreg.clock
+  %2 = nvvm.read.ptx.sreg.clock : i32
+  %4 = llvm.insertvalue %1, %0[0]: !llvm.struct<(i32, i32)>
+  %5 = llvm.insertvalue %2, %4[1]: !llvm.struct<(i32, i32)>
+  llvm.return %5: !llvm.struct<(i32, i32)>
+}
+
+// CHECK-LABEL: @nvvm_special_regs_clock64
+llvm.func @nvvm_special_regs_clock64() -> !llvm.struct<(i64, i64)> {
+  %0 = llvm.mlir.zero: !llvm.struct<(i64, i64)>
+  // CHECK:  {{.*}} = nvvm.read.ptx.sreg.clock64
+  %1 = nvvm.read.ptx.sreg.clock64 : i64
+  // CHECK:  {{.*}} = nvvm.read.ptx.sreg.clock64
+  %2 = nvvm.read.ptx.sreg.clock64 : i64
+  %4 = llvm.insertvalue %1, %0[0]: !llvm.struct<(i64, i64)>
+  %5 = llvm.insertvalue %2, %4[1]: !llvm.struct<(i64, i64)>
+  llvm.return %5: !llvm.struct<(i64, i64)>
+}
+
+// CHECK-LABEL: @nvvm_special_regs_globaltimer
+llvm.func @nvvm_special_regs_globaltimer() -> !llvm.struct<(i64, i64)> {
+  %0 = llvm.mlir.zero: !llvm.struct<(i64, i64)>
+  // CHECK:  {{.*}} = nvvm.read.ptx.sreg.globaltimer
+  %1 = nvvm.read.ptx.sreg.globaltimer : i64
+  // CHECK:  {{.*}} = nvvm.read.ptx.sreg.globaltimer
+  %2 = nvvm.read.ptx.sreg.globaltimer : i64
+  %4 = llvm.insertvalue %1, %0[0]: !llvm.struct<(i64, i64)>
+  %5 = llvm.insertvalue %2, %4[1]: !llvm.struct<(i64, i64)>
+  llvm.return %5: !llvm.struct<(i64, i64)>
+}
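
The test above relies on two consecutive `CHECK` lines with the same pattern: the second can only match if a second read survives CSE. A simplified model of that ordered matching is sketched below. The `ordered_check` function and the sample strings are hypothetical, and real FileCheck supports far more (labels, variables, `CHECK-NOT`, and so on), so treat this only as an illustration of the idea.

```python
import re


def ordered_check(patterns, text):
    """Return True if each regex matches after the end of the previous match."""
    pos = 0
    for pattern in patterns:
        match = re.compile(pattern).search(text, pos)
        if match is None:
            return False
        pos = match.end()  # the next pattern must match later in the text
    return True


# Simplified version of the pattern used by the CHECK lines in the test.
CLOCK_READ = r"= nvvm\.read\.ptx\.sreg\.clock"

# Hypothetical pass output when both reads are preserved.
both_kept = (
    "%1 = nvvm.read.ptx.sreg.clock : i32\n"
    "%2 = nvvm.read.ptx.sreg.clock : i32\n"
)

# Hypothetical pass output if CSE had merged the two reads.
merged = "%1 = nvvm.read.ptx.sreg.clock : i32\n"

ok_when_kept = ordered_check([CLOCK_READ, CLOCK_READ], both_kept)
ok_when_merged = ordered_check([CLOCK_READ, CLOCK_READ], merged)
```

Under this model the pair of checks passes on `both_kept` and fails on `merged`, which is why duplicating the `CHECK` line is enough to detect an unwanted merge.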

@grypp
Member

grypp commented Jul 9, 2025

Thanks, good catch!

schwarzschild-radius force-pushed the update_traits_for_clock_ops branch from bd6b00c to ef6165d on July 10, 2025 07:04
Contributor

@durga4github durga4github left a comment

LGTM with a minor nit.

schwarzschild-radius force-pushed the update_traits_for_clock_ops branch 2 times, most recently from ddab115 to f49d36d on July 11, 2025 06:18
Comment on lines 156 to 158
// NVVM_PureSpecialRegisterOp represents special register ops that can be
// speculated and do not touch memory. These operations are always
// legal to hoist or sink.
Member

I don't think we need an extra comment here; Pure is self-explanatory.

Contributor Author

Got it. Reverted the comment

Member

@grypp grypp left a comment

just nit, all good

schwarzschild-radius force-pushed the update_traits_for_clock_ops branch from f49d36d to 9c0d0f6 on July 11, 2025 06:36
@schwarzschild-radius schwarzschild-radius merged commit 5cd56c9 into llvm:main Jul 11, 2025
9 checks passed
4 participants